

Search for: All records

Creators/Authors contains: "Durumeric, Zakir"


  1. Free, publicly-accessible full text available August 12, 2026
  2. Free, publicly-accessible full text available May 12, 2026
  3. Machine learning has shown tremendous potential for improving the capabilities of network traffic analysis applications, often outperforming simpler rule-based heuristics. However, ML-based solutions remain difficult to deploy in practice. Many existing approaches only optimize the predictive performance of their models, overlooking the practical challenges of running them against network traffic in real time. This is especially problematic in the domain of traffic analysis, where the efficiency of the serving pipeline is a critical factor in determining the usability of a model. In this work, we introduce CATO, a framework that addresses this problem by jointly optimizing the predictive performance and the associated systems costs of the serving pipeline. CATO leverages recent advances in multi-objective Bayesian optimization to efficiently identify Pareto-optimal configurations, and automatically compiles end-to-end optimized serving pipelines that can be deployed in real networks. Our evaluations show that compared to popular feature optimization techniques, CATO can provide up to 3600× lower inference latency and 3.7× higher zero-loss throughput while simultaneously achieving better model performance. 
    Free, publicly-accessible full text available April 28, 2026
  4. Since ZMap’s debut in 2013, networking and security researchers have used the open-source scanner to write hundreds of research papers that study Internet behavior. In addition, ZMap has been adopted by the security industry to build new classes of enterprise security and compliance products. Over the past decade, much of ZMap’s behavior—ranging from its pseudorandom IP generation to its packet construction—has evolved as we have learned more about how to scan the Internet. In this work, we quantify ZMap’s adoption over the ten years since its release, describe its modern behavior (and the measurements that motivated changes), and offer lessons from releasing and maintaining ZMap for future tools. 
    Free, publicly-accessible full text available November 4, 2025
  5. Virtual Private Networks (VPNs) are increasingly being used to protect online users’ privacy and security. However, there is an ongoing arms race between censors that aim to detect and block VPN usage, and VPN providers that aim to obfuscate their services from these censors. In this paper, we explore the feasibility of a simple, protocol-agnostic VPN detection technique based on identifying encapsulated TCP behaviors in UDP-based tunnels. We derive heuristics to distinguish TCP-over-UDP VPN traffic from plain UDP traffic using RFC-defined TCP behaviors. Our evaluations on real-world traffic show that this technique can achieve a false positive rate (FPR) of 0.11%, an order of magnitude lower than existing machine learning-based VPN detection methods. We suggest defenses to evade our detection technique and encourage VPN providers to proactively defend against such attacks.
  6. Social scientists and computer scientists are increasingly using observational digital trace data and analyzing these data post hoc to understand the content people are exposed to online. However, these content collection efforts may be systematically biased when the entirety of the data cannot be captured retroactively. We call this often unstated assumption the problematic assumption of accessibility. To examine the extent to which this assumption may be problematic, we identify 107k hard news and misinformation web pages visited by a representative panel of 1,238 American adults and record the degree to which the web pages individuals visited were accessible via successful web scrapes or inaccessible via unsuccessful scrapes. While we find that the URLs collected are largely accessible and with unrestricted content, we find there are systematic biases in which URLs are restricted, return an error, or are inaccessible. For example, conservative misinformation URLs are more likely to be inaccessible than other types of misinformation. We suggest how social scientists should capture and report digital trace and web scraping data. 
  7. In 2019, the US Department of Homeland Security issued an emergency warning about DNS infrastructure tampering. This alert, in response to a series of attacks against foreign government websites, highlighted how a sophisticated attacker could leverage access to key DNS infrastructure to then hijack traffic and harvest valid login credentials for target organizations. However, even armed with this knowledge, identifying the existence of such incidents has been almost entirely via post hoc forensic reports (i.e., after a breach was found via some other method). Indeed, such attacks are particularly challenging to detect because they can be very short lived, bypass the protections of TLS and DNSSEC, and are imperceptible to users. Identifying them retroactively is further complicated by the lack of fine-grained Internet-scale forensic data. This paper is a first attempt to make progress toward this latter goal. Combining a range of longitudinal data from Internet-wide scans, passive DNS records, and Certificate Transparency logs, we have constructed a methodology for identifying potential victims of sophisticated DNS infrastructure hijacking and have used it to identify a range of victims (primarily government agencies), both those named in prior reporting and others previously unknown.
  8. We argue that existing security, privacy, and anti-abuse protections fail to address the growing threat of online hate and harassment. In order for our community to understand and address this gap, we propose a taxonomy for reasoning about online hate and harassment. Our taxonomy draws on over 150 interdisciplinary research papers that cover disparate threats ranging from intimate partner violence to coordinated mobs. In the process, we identify seven classes of attacks (such as toxic content and surveillance) that each stem from different attacker capabilities and intents. We also provide longitudinal evidence from a three-year survey that hate and harassment is a pervasive, growing experience for online users, particularly for at-risk communities like young adults and people who identify as LGBTQ+. Responding to each class of hate and harassment requires a unique strategy and we highlight five such potential research directions that ultimately empower individuals, communities, and platforms to do so.
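
The CATO abstract (item 3) refers to Pareto-optimal configurations, meaning configurations for which no alternative is at least as good on every objective and strictly better on one. As a minimal sketch of that notion only, the Python below filters a small set of candidates by brute force. It is not CATO's method, which the abstract says uses multi-objective Bayesian optimization; the `Config` type, the two objectives chosen (F1 score and latency), and all numeric values are hypothetical and invented for illustration.

```python
from typing import NamedTuple


class Config(NamedTuple):
    """A hypothetical serving-pipeline configuration (illustrative only)."""
    name: str
    f1: float          # predictive performance, higher is better
    latency_us: float  # inference latency in microseconds, lower is better


def dominates(a: Config, b: Config) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.f1 >= b.f1 and a.latency_us <= b.latency_us
    strictly_better = a.f1 > b.f1 or a.latency_us < b.latency_us
    return no_worse and strictly_better


def pareto_front(configs: list[Config]) -> list[Config]:
    """Keep only the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up candidate values, not results from the paper.
candidates = [
    Config("A", 0.90, 50.0),
    Config("B", 0.95, 400.0),
    Config("C", 0.85, 80.0),   # worse than A on both objectives
    Config("D", 0.95, 900.0),  # same F1 as B but slower
]

front = pareto_front(candidates)
print([c.name for c in front])
```

Exhaustive filtering like this compares every pair of candidates, so it is only practical when each candidate is cheap to evaluate; a search procedure such as the Bayesian optimization named in the abstract is aimed at the case where each evaluation is expensive.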